Contributor

@liran-funaro liran-funaro commented Jul 10, 2025

Type of change

  • Bug fix
  • Improvement (improvement to code, performance, etc)
  • Breaking change

Description

  • [API] Modify naming and numbering of status codes to form groups:
    • committed/aborted (registered in the state DB)
    • malformed/duplicated-ID (cannot be registered in the state DB)
    • malformed (registered in the state DB)
  • [sidecar] Move well-formed TX check to the sidecar
  • [sidecar] Verify namespace TX's public key
  • [sidecar] Verify config block and meta-namespace's public key
  • [sidecar] Verify TX ID matches the envelope TX ID (temporary workaround)
  • [utils] Use TX ID as the envelope ID (temporary workaround)
  • [relay] Fix a bug that used the wrong TX index in the block
  • [relay] Fix a bug that caused the system to hang when the internal goroutines in preProcessBlockAndSendToCoordinator() fail.
    • Split preProcessBlockAndSendToCoordinator() into preProcessBlock() and sendBlocksToCoordinator() so that all goroutines share one context
  • [coordinator] Add metrics for all statuses
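
A minimal sketch of the shared-context idea behind the relay hang fix, not the PR's actual code. It assumes a two-stage pipeline (pre-process, then send) connected by an unbuffered channel; `block`, `runPipeline`, `runDemo`, and `errSend` are hypothetical names introduced only for illustration. The first stage to fail cancels the shared context, which unblocks the other stage instead of leaving it waiting on the channel.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// block is a hypothetical stand-in for a ledger block.
type block struct{ num int }

// errSend is a hypothetical downstream failure used by the demo.
var errSend = errors.New("coordinator unavailable")

// runPipeline runs a pre-processing stage and a sending stage under one
// shared context. A failure in the sender cancels the context, so the
// producer stops blocking on the channel and both goroutines exit.
func runPipeline(ctx context.Context, blocks []block, send func(block) error) error {
	ctx, cancel := context.WithCancelCause(ctx)
	defer cancel(nil)

	out := make(chan block) // unbuffered: the producer blocks until the sender reads
	var wg sync.WaitGroup
	wg.Add(2)

	// Stage 1: pre-process blocks and hand them to the sender.
	go func() {
		defer wg.Done()
		defer close(out)
		for _, b := range blocks {
			select {
			case out <- b:
			case <-ctx.Done():
				return // the sender failed or the caller cancelled
			}
		}
	}()

	// Stage 2: send pre-processed blocks downstream.
	go func() {
		defer wg.Done()
		for {
			select {
			case b, ok := <-out:
				if !ok {
					return // the producer finished
				}
				if err := send(b); err != nil {
					cancel(fmt.Errorf("send block %d: %w", b.num, err))
					return
				}
			case <-ctx.Done():
				return
			}
		}
	}()

	wg.Wait()
	// Nil when nothing was cancelled; otherwise the first recorded cause.
	return context.Cause(ctx)
}

// runDemo pushes three blocks through the pipeline and makes the sender
// fail on the block whose number equals failAt (0 means never fail).
func runDemo(failAt int) error {
	send := func(b block) error {
		if b.num == failAt {
			return errSend
		}
		return nil
	}
	return runPipeline(context.Background(), []block{{num: 1}, {num: 2}, {num: 3}}, send)
}

func main() {
	fmt.Println(runDemo(0))
	fmt.Println(runDemo(2))
}
```

Without the `ctx.Done()` branch in the producer's `select`, a sender failure would leave the producer blocked on `out <- b` forever, which is the kind of hang the description refers to.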

Related issues

@liran-funaro liran-funaro requested a review from cendhu July 10, 2025 10:06
@liran-funaro liran-funaro added the bug and breaking labels Jul 10, 2025
@liran-funaro liran-funaro force-pushed the verify-form-sidecar branch 2 times, most recently from 0cc10fa to 4872c83 Compare July 14, 2025 10:01
@liran-funaro liran-funaro marked this pull request as draft July 16, 2025 13:23
@liran-funaro liran-funaro force-pushed the verify-form-sidecar branch 5 times, most recently from 62bf9e2 to 5091087 Compare July 16, 2025 15:07
@liran-funaro liran-funaro marked this pull request as ready for review July 16, 2025 15:08
@liran-funaro liran-funaro force-pushed the verify-form-sidecar branch from 5091087 to f9507c6 Compare July 16, 2025 15:13
if !IsStatusStoredInDB(status) {
err := b.withStatus.setFinalStatus(txNum, status)
if err != nil {
// This should never happen unless there is a bug in the mapper.
Contributor

If the bug is handled deterministically across all instances, it is not a problem; otherwise, a fork can occur. Since such a bug can be serious, why not terminate the sidecar when one is detected? Alternatively, we can remove the following lines:

	if b.txStatus[txNum] != statusNotYetValidated {
		// This can never occur unless there is a bug in the relay or the coordinator.
		return errors.Newf("two results for a TX [blockNum: %d, txNum: %d]", b.block.Header.Number, txNum)
	}

Contributor Author

Yes, I agree that we should terminate on a bug, and we should terminate here as well.
The issue is having to propagate the error only for a case that should never happen.
Would it be acceptable to panic in such a case, for example by using logger.Fatal()? What do you think?

Contributor

@cendhu cendhu Jul 29, 2025

Sure, I am fine with a panic here, though the Effective Go recommendation is to propagate the error to main. At the very least, we should print the stack trace here.

Contributor Author

I went with propagation to avoid panicking in tests.
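
A rough sketch of what propagating the "should never happen" error might look like, assuming a simplified status tracker. The `status` type, `blockStatus`, `applyResults`, and `errDuplicateResult` are hypothetical; the PR itself uses `errors.Newf`, whereas this sketch sticks to the standard library.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a hypothetical stand-in for the protobuf-generated status type.
type status int

const (
	statusNotYetValidated status = iota
	statusCommitted
)

// errDuplicateResult marks a second result for the same TX, which can only
// happen if another component has a bug.
var errDuplicateResult = errors.New("two results for a TX")

// blockStatus tracks one final status per TX in a block.
type blockStatus struct {
	blockNum uint64
	txStatus []status
}

// setFinalStatus records the status once and reports an error on a repeat.
func (b *blockStatus) setFinalStatus(txNum int, s status) error {
	if b.txStatus[txNum] != statusNotYetValidated {
		return fmt.Errorf("%w [blockNum: %d, txNum: %d]", errDuplicateResult, b.blockNum, txNum)
	}
	b.txStatus[txNum] = s
	return nil
}

// applyResults returns the first error to its caller rather than panicking,
// so tests can assert on it and main can decide to terminate the process.
func applyResults(b *blockStatus, txNums []int) error {
	for _, n := range txNums {
		if err := b.setFinalStatus(n, statusCommitted); err != nil {
			return fmt.Errorf("apply results: %w", err)
		}
	}
	return nil
}

func main() {
	b := &blockStatus{blockNum: 7, txStatus: make([]status, 2)}
	if err := applyResults(b, []int{0, 1, 1}); err != nil {
		// A real main would log the error and exit with a non-zero code here.
		fmt.Println("fatal:", err)
	}
}
```

Wrapping with `%w` keeps the sentinel reachable through `errors.Is`, so a test can check for the bug condition without triggering a process exit.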

Comment on lines 94 to 98
b.rejectTx(msgIndex, hdr, protoblocktx.Status_MALFORMED_BAD_ENVELOPE, envErr.Error())
return
}
if hdr.TxId == "" {
b.rejectTx(msgIndex, hdr, protoblocktx.Status_MALFORMED_MISSING_TX_ID, "no TX ID")
Contributor

Why don't we directly set the final status here itself rather than calling rejectTx()? Otherwise, it is possible to call hdr.TxId on nil. Hence, I feel this is a bit fragile. In fact, the debug() in IsStatusStoredInDB is accessing hdr.TxId and can result in a panic if debug logging is enabled. If you want to retain the call to rejectTx(), there has to be some validation on the func parameters.

Contributor Author

Why don't we directly set the final status here itself rather than calling rejectTx().

The reasoning for using rejectTx() instead of directly setting the final status is to relieve the developer from the burden of understanding when we call "reject" and when we set the status directly.
Having this logic inside rejectTx() makes the reasoning self-explanatory.

it is possible to call hdr.TxId on nil

When data, hdr, envErr := serialization.UnwrapEnvelope(msg) returns without an error, hdr is never nil. So the error you refer to cannot happen.

I feel this is a bit fragile. In fact, the debug() in IsStatusStoredInDB is accessing hdr.TxId and can result in a panic if debug logging is enabled. If you want to retain the call to rejectTx(), there has to be some validation on the func parameters.

debugTx() handles the case of nil header.
The only "fragile" case is when we reject with MALFORMED_BAD_ENVELOPE. In such a case, hdr==nil, and we rely on the fact that this status is not stored in the DB.

To make the code more robust, I added the rejectNonDBStatusTx() method and used it when the status is known in advance and it is not stored in the DB. This also eliminates the recursion.
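
The split described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the `mapper` and `header` types, the numeric status values, the `statusMalformedStored` status, and the bodies of both methods are hypothetical and not taken from the PR. Only the two method names and the idea that MALFORMED_BAD_ENVELOPE is not stored in the DB come from the discussion.

```go
package main

import "fmt"

// status is a hypothetical stand-in for the protobuf-generated status type.
type status int

const (
	statusMalformedBadEnvelope status = iota + 1 // not stored in the DB (no header)
	statusMalformedMissingTxID                   // not stored in the DB (no TX ID)
	statusMalformedStored                        // hypothetical malformed status that is stored
)

// header is a hypothetical stand-in for the envelope header.
type header struct{ TxID string }

// mapper collects statuses decided locally and TXs forwarded for storage.
type mapper struct {
	final   map[int]status // txNum -> final status that never reaches the DB
	pending map[int]string // txNum -> TX ID whose status will be recorded in the DB
}

// isStatusStoredInDB reports whether a status must be recorded in the DB.
func isStatusStoredInDB(s status) bool { return s == statusMalformedStored }

// rejectNonDBStatusTx is for statuses known in advance not to be stored.
// It never touches the header, so a nil header is harmless here.
func (m *mapper) rejectNonDBStatusTx(txNum int, s status) error {
	if isStatusStoredInDB(s) {
		return fmt.Errorf("tx %d: status %d must be stored in the DB", txNum, s)
	}
	m.final[txNum] = s
	return nil
}

// rejectTx decides where the status goes and validates the header before
// dereferencing it.
func (m *mapper) rejectTx(txNum int, hdr *header, s status) error {
	if !isStatusStoredInDB(s) {
		return m.rejectNonDBStatusTx(txNum, s)
	}
	if hdr == nil {
		return fmt.Errorf("tx %d: status %d requires a header", txNum, s)
	}
	m.pending[txNum] = hdr.TxID
	return nil
}

func main() {
	m := &mapper{final: map[int]status{}, pending: map[int]string{}}
	_ = m.rejectNonDBStatusTx(0, statusMalformedBadEnvelope) // no header available
	_ = m.rejectTx(1, &header{TxID: "tx-1"}, statusMalformedStored)
	fmt.Println(m.final, m.pending)
}
```

With this shape, the bad-envelope path no longer depends on the status happening not to be stored, since the header is never consulted on that path.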

Contributor

@cendhu cendhu left a comment

Overall, it looks good to me. A few minor issues need to be addressed.

Signed-off-by: Liran Funaro <liran.funaro@gmail.com>
Signed-off-by: Liran Funaro <liran.funaro@gmail.com>
@liran-funaro liran-funaro force-pushed the verify-form-sidecar branch from c2f5f0c to 4d811a9 Compare July 29, 2025 08:18
@liran-funaro liran-funaro requested a review from cendhu July 29, 2025 08:46
Contributor

@cendhu cendhu left a comment

LGTM.

@cendhu cendhu merged commit 57b2bf7 into hyperledger:main Jul 30, 2025
17 of 18 checks passed
@liran-funaro liran-funaro deleted the verify-form-sidecar branch July 30, 2025 08:01